feat(prefilter): registry for precomputed filter bitmaps#207
Merged
Conversation
Registers named clause sets whose bitmap gets precomputed once and cached.
When a query's canonicalized clause set is a superset of a registered
prefilter's clauses, those clauses are elided from the query and replaced
by a single BucketBitmap AND against the cached bitmap.
For the Civitai safety prefix (7 clauses + a compound NOT) this turns ~900ms
of filter work into one intersect. See docs/_in/design-prefilter-registry.md.
- src/prefilter.rs: PrefilterRegistry (DashMap<name, Arc<PrefilterEntry>>),
PrefilterEntry (ArcSwap<RoaringBitmap> + atomic stats), canonicalizer
(sort In values, flatten And, collapse Not-Not, sort/dedupe top level),
longest-match subset detection, substitute() helper. 15 unit tests.
- src/concurrent_engine.rs: registry lives on ConcurrentEngine; exposes
prefilters(), register_prefilter(), refresh_prefilter(),
unregister_prefilter(). resolve_filters{_traced} substitutes inline.
- src/server.rs: POST/GET/DELETE /api/indexes/{name}/prefilters and a
POST /{prefilter_name}/refresh for manual re-compute.
- src/query_metrics.rs: QueryTrace.prefilter records the substituted name.
- tests/prefilter_integration.rs: end-to-end — register, verify
substitution fires, confirm result-set identity, verify refresh
updates cardinality.
silo_copy (pre-existing broken, references removed DocStoreV3) is gated
behind the new legacy-docstorev3 feature so it stops breaking default
builds. Task #132 tracks its full removal.
No background refresh thread yet — re-register via POST or call
/refresh. Background SWR is the Phase 1b follow-up.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
JustMaier
added a commit
that referenced
this pull request
Apr 14, 2026
Review findings from Gemini + GPT + Plan Reviewer on PR #207 surfaced two critical bugs plus a few design-doc deviations worth documenting. 1. Deleted-doc resurrection (Gemini, CRITICAL) The engine uses clean-delete: on delete, the doc's bits are cleared from every filter/sort bitmap AND from alive, so filter bitmaps stay clean and the query hot path has no `alive AND filter`. The cached prefilter bitmap is NOT maintained by clean-delete — it's a frozen snapshot. Between refreshes, deleted docs still have bits set in the cached prefilter. A query whose clauses match the prefilter (no extra clauses to narrow against clean filter bitmaps) would have acc = stale bitmap → deleted docs leak through. Fix: substitute() now takes an `&RoaringBitmap alive` argument and intersects the cached prefilter bitmap with alive before wrapping it in a BucketBitmap clause. Cost: one bitmap AND (~20-50ms at 107M) per substituted query — still a large win vs the 300-900ms saved by eliding the original clauses. Verified by: - unit test substitute_excludes_dead_slots_from_stale_prefilter - integration test deleted_docs_excluded_from_stale_prefilter (registers prefilter, deletes 3/10 docs without refresh, confirms substituted query returns 7 IDs excluding the deleted slots) 2. Tokio reactor starvation (Gemini, CRITICAL) handle_register_prefilter and handle_refresh_prefilter called engine.register_prefilter / engine.refresh_prefilter directly on the Axum async worker thread. compute_filters over 100M-bit bitmaps is ~300-900ms of CPU work — running that on a reactor thread starves the runtime and tanks p99 for every other request. Fix: wrap both in tokio::task::spawn_blocking. Errors propagate as JoinError → 500 Internal Server Error with message. No other behavioral change. 3. Or-children canonicalization (Plan Reviewer, low priority) Design open question #1: whether Or(A, B) and Or(B, A) should canonicalize identically. Current answer is "no" — we don't reorder Or children. Added unit test canonicalize_does_not_reorder_or_children as a regression backstop: if a future change enables Or sorting, the test must be updated deliberately. 4. Documented deviations + Phase 1 hazards in the design doc - Registry on ConcurrentEngine, not InnerEngine (reason: avoid snapshot publish on registration; clause sets are immutable so no coherence issue) - Phase 1b flag gate skipped (empty-registry early-return is the kill switch) - Cache/prefilter drift bifurcation: cache-hit path and cache-miss path may return slightly different results under drift; bounded by refresh interval; acceptable for Phase 1 per design's explicit "mild drift is acceptable"; Phase 2 includes prefilter version in cache key - Delete-drift / alive-intersection rationale Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
3 tasks
JustMaier
added a commit
that referenced
this pull request
Apr 14, 2026
Review findings from Gemini + GPT + Plan Reviewer on PR #207 surfaced two critical bugs plus a few design-doc deviations worth documenting. 1. Deleted-doc resurrection (Gemini, CRITICAL) The engine uses clean-delete: on delete, the doc's bits are cleared from every filter/sort bitmap AND from alive, so filter bitmaps stay clean and the query hot path has no `alive AND filter`. The cached prefilter bitmap is NOT maintained by clean-delete — it's a frozen snapshot. Between refreshes, deleted docs still have bits set in the cached prefilter. A query whose clauses match the prefilter (no extra clauses to narrow against clean filter bitmaps) would have acc = stale bitmap → deleted docs leak through. Fix: substitute() now takes an `&RoaringBitmap alive` argument and intersects the cached prefilter bitmap with alive before wrapping it in a BucketBitmap clause. Cost: one bitmap AND (~20-50ms at 107M) per substituted query — still a large win vs the 300-900ms saved by eliding the original clauses. Verified by: - unit test substitute_excludes_dead_slots_from_stale_prefilter - integration test deleted_docs_excluded_from_stale_prefilter (registers prefilter, deletes 3/10 docs without refresh, confirms substituted query returns 7 IDs excluding the deleted slots) 2. Tokio reactor starvation (Gemini, CRITICAL) handle_register_prefilter and handle_refresh_prefilter called engine.register_prefilter / engine.refresh_prefilter directly on the Axum async worker thread. compute_filters over 100M-bit bitmaps is ~300-900ms of CPU work — running that on a reactor thread starves the runtime and tanks p99 for every other request. Fix: wrap both in tokio::task::spawn_blocking. Errors propagate as JoinError → 500 Internal Server Error with message. No other behavioral change. 3. Or-children canonicalization (Plan Reviewer, low priority) Design open question #1: whether Or(A, B) and Or(B, A) should canonicalize identically. Current answer is "no" — we don't reorder Or children. Added unit test canonicalize_does_not_reorder_or_children as a regression backstop: if a future change enables Or sorting, the test must be updated deliberately. 4. Documented deviations + Phase 1 hazards in the design doc - Registry on ConcurrentEngine, not InnerEngine (reason: avoid snapshot publish on registration; clause sets are immutable so no coherence issue) - Phase 1b flag gate skipped (empty-registry early-return is the kill switch) - Cache/prefilter drift bifurcation: cache-hit path and cache-miss path may return slightly different results under drift; bounded by refresh interval; acceptable for Phase 1 per design's explicit "mild drift is acceptable"; Phase 2 includes prefilter version in cache key - Delete-drift / alive-intersection rationale Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
JustMaier
added a commit
that referenced
this pull request
Apr 14, 2026
Wires the 5 metrics from design-prefilter-registry.md plus one extra age gauge into the collect-on-scrape path. Atomics were already tracked on PrefilterEntry from PR #207; this PR adds scrape-side plumbing so Grafana can show registered count, cardinality, substitution rate, last compute duration, refresh errors, and age-since-refresh. - bitdex_prefilter_registered{index} — gauge, count of registered entries - bitdex_prefilter_cardinality{index,name} — gauge, current bitmap size - bitdex_prefilter_substitutions_total{index,name} — counter surfaced as gauge (matches existing cache_hits_total pattern — the value comes from an AtomicU64 counter on the entry, not from a live Prometheus counter) - bitdex_prefilter_last_compute_seconds{index,name} — gauge - bitdex_prefilter_refresh_errors_total{index,name} — gauge - bitdex_prefilter_age_seconds{index,name} — gauge, alert candidate: if an SWR thread or orchestrator isn't refreshing, age will grow unbounded Tests: lib 17/17 + integration 3/3 still green. No behavior change to the substitute() hot path. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This was referenced Apr 14, 2026
JustMaier
added a commit
that referenced
this pull request
Apr 14, 2026
Wires the 5 metrics from design-prefilter-registry.md plus one extra age gauge into the collect-on-scrape path. Atomics were already tracked on PrefilterEntry from PR #207; this PR adds scrape-side plumbing so Grafana can show registered count, cardinality, substitution rate, last compute duration, refresh errors, and age-since-refresh. - bitdex_prefilter_registered{index} — gauge, count of registered entries - bitdex_prefilter_cardinality{index,name} — gauge, current bitmap size - bitdex_prefilter_substitutions_total{index,name} — counter surfaced as gauge (matches existing cache_hits_total pattern — the value comes from an AtomicU64 counter on the entry, not from a live Prometheus counter) - bitdex_prefilter_last_compute_seconds{index,name} — gauge - bitdex_prefilter_refresh_errors_total{index,name} — gauge - bitdex_prefilter_age_seconds{index,name} — gauge, alert candidate: if an SWR thread or orchestrator isn't refreshing, age will grow unbounded Tests: lib 17/17 + integration 3/3 still green. No behavior change to the substitute() hot path. Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Ships the prefilter registry from
docs/_in/design-prefilter-registry.md: named, precomputed filter bitmaps that the planner substitutes into matching queries. For the Civitai feed's 7-clause safety prefix (compound NOT alone = ~588 ms/query), this turns ~900 ms of filter work into a single bitmap intersect.What's in this PR (Phase 1a + 1c substitution)
src/prefilter.rs— new module (~430 lines):PrefilterRegistry(DashMap<name, Arc>), capped at 32 entriesPrefilterEntrywithArcSwap<RoaringBitmap>+ atomic statsIn/NotInvalues, flattenAnd, collapseNot(Not(X)), sort + dedupe top levelsubstitute()helpersrc/concurrent_engine.rs— registry lives onConcurrentEngine:prefilters(),register_prefilter(),refresh_prefilter(),unregister_prefilter()resolve_filters{_traced}callsprefilter::substitute()inline before the plannersrc/server.rs— endpoints (admin-gated):POST /api/indexes/{name}/prefilters— register + computeGET /api/indexes/{name}/prefilters— listDELETE /api/indexes/{name}/prefilters/{prefilter_name}— unregisterPOST /api/indexes/{name}/prefilters/{prefilter_name}/refresh— manual recomputesrc/query_metrics.rs—QueryTrace.prefilterrecords the substituted name for observabilitytests/prefilter_integration.rs— end-to-end: register, run matching + non-matching queries, confirm result-set identity and trace signal; refresh updates cardinality after new insertsCargo.toml— gatessilo_copy(pre-existing broken, references removedDocStoreV3) behind a newlegacy-docstorev3feature so it stops blocking default builds. Task feat: range scan key enumeration from BitmapSilo #132 tracks its full removal.What's NOT in this PR
/refreshto update.prefilter_registered_total,prefilter_substitutions_total,prefilter_cardinality,prefilter_refresh_seconds. The atomics are inPrefilterEntry; just need the scrape-side wiring. Following up separately.Correctness
Substitution is an exact subset match on canonicalized clauses, so
compute(A ∧ B ∧ C ∧ rest)becomesprefilter_bitmap(A ∧ B ∧ C) ∧ rest— trivially equivalent modulo bitmap freshness (drift bounded by refresh interval). The integration test verifies result-set identity between the substituted and non-substituted paths.Test plan
cargo test --lib prefilter::— 15 unit tests (canonicalization, subset match, registry CRUD, max-entry cap, substitution shape)cargo test --test prefilter_integration— 2 end-to-end tests passing in ~5 scivitai_safe(Civitai safety prefix), run feed queries, verifyfilter_usdrops 300–800 ms with identical result setsRollout
civitai_safe; measure filter_us drop.🤖 Generated with Claude Code